Modeling Superscalar Processor Memory-Level Parallelism
Authors
Abstract
Similar resources
Optimum Instruction-level Parallelism (ILP) for Superscalar and VLIW Processors
Modern superscalar and VLIW processors fetch, decode, issue, execute, and retire multiple instructions per cycle. By taking advantage of instruction-level parallelism (ILP), processor performance can be improved substantially. However, increasing the level of ILP may eventually result in diminishing and negative returns due to control and data dependencies among subsequent instructions as well ...
Improving Instruction Level Parallelism through Reconfigurable Units in Superscalar Processors
With reducing feature sizes, more transistors can be integrated on the chip. The increased transistor budget can be utilized to improve the instruction level parallelism (ILP) exploited from the processor. However, the transistors cannot be used to arbitrarily increase the processor width and size in the hope of exploiting better ILP. In this paper, we propose an architecture where the supersca...
Instruction-level Parallelism in Asynchronous Processor Architectures
The Micronet-based Asynchronous Processor (MAP) is a family of processor architectures based on the micronet model of asynchronous control. Micronets distribute the control amongst the functional units, which enables the exploitation of fine-grained concurrency, both between and within program instructions. This paper introduces the micronet model and evaluates the performance of micronet-based d...
Instruction Level Parallelism In Arm Processor
Now, single-processor performance improvement has dropped Limited amount of exploitable instruction-level parallelism in Load-store ISA: ARM, MIPS. ARM Processors In early 2015, ARM announced a suite of IP for Premium Mobile designs, capability deepens the window for instruction level parallelism. Abstract: Advanced modern processors support Single Instruction Multiple Modular arithmetic, SIMD-...
Tradeoffs in Processor/Memory Interfaces for Superscalar Processors
The current scheme of dealing with data cache misses is not well-suited for superscalar processors. In this scheme, the processor is blocked by holding its clock low until the missing cache block can be fetched from memory and inserted into the cache. From the processor's viewpoint, the miss did not occur. From the user's viewpoint, the execution time was lengthened in direct proportion to the ...
Journal
Journal title: IEEE Computer Architecture Letters
Year: 2018
ISSN: 1556-6056
DOI: 10.1109/lca.2017.2701370